1.  Introduction


Estimators for the equations of motion of dynamical systems are of interest
for a number of applications, including control of chaos [1] and model
identification for signal processing. Estimation of the differential
equations of motion for a continuous-time system is complicated by a number
of factors, even if the topology of the system is well understood. These
factors include the availability of only undersampled and noisy data, and the
need for high-order basis expansions to fit the differential equations in a
coordinate delay embedding. Fortunately, a sufficient description of a
dynamical system can often be made using only low-dimensional discrete-time
return maps. This is the case with the highly successful techniques recently
developed for the control and targeting of chaotic dynamical systems [1,2].
The return map is usually estimated with the use of bin averaging or function
fitting, which yields sufficient accuracy for many purposes. For
technological applications, however, the ability to quickly and accurately
estimate return maps in the presence of noise with drifting or intentionally
altered system parameters becomes important. A technique for doing this for a
one-dimensional chaotic map with known functional form is considered here.
This estimation technique can be regarded as coherent parameter detection, as
opposed to other methods, which do not look at long-range correlations in the
data sequence. The application of the technique is demonstrated using the
logistic map, and it is shown that the estimator is efficient for
trajectories where an efficient estimator exists. The technique should be
extensible to higher-dimensional systems without major complications.

To conclude the report, a more general result is considered: the
interpretation of a certain entropy-like quantity as a parameter information
acquisition rate. This result is obtained by considering the exponential
convergence rate of the efficient estimator. Thus, the detection of the map
parameter using the efficient estimation technique can be thought of as
obtaining parameter information at a finite rate.


2.  Discrete-Time  Chaotic  Dynamical  Systems 

In  a  discrete-time  dynamical  system,  the  time  parameter  takes  on  integer 
values.  The  system  is  assumed  to  have  definable  and  distinct  configurations 
or  states  that  can  be  represented  by  a  state  point  or  vector  in  a  state  space. 
Mathematically,  a  discrete-time  system  is  usually  described  by  a  mapping 
of  previous  state-space  vectors  into  the  next  one,  with  time  as  an  integer 
index: 

    X_{n+1} = F(X_n, X_{n-1}, \ldots, X_0) .  (1)

Examples of discrete-time systems are the difference equations of linear
system theory, and iterated functions mapping a space into itself. A
first-order discrete-time system is one whose current state vector depends
only on the previous one. A first-order system with system noise present is
described by the stochastic map

    X_{n+1} = F(X_n) + \xi_{n+1} ,  (2)

where ξ is a random vector. Each noise vector is therefore propagated forward
in time under the action of the map. A first-order system with only additive
noise present is described by

    X_{n+1} = F(X_n - \xi_n) + \xi_{n+1} .  (3)

The  noise  vector  is  therefore  additive  but  is  not  propagated  forward  by  the 
map.  Only  additive  noise  is  considered  in  the  derivation  and  evaluation  of 
the  estimators  here. 
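
The difference between the two noise models can be illustrated with a short
simulation. The sketch below is only an illustration under stated
assumptions: the logistic map stands in for F, the noise is Gaussian, and the
function names are hypothetical rather than taken from the report.

```python
import random

def logistic(x, theta):
    """One step of the logistic map, used here as an example of F."""
    return theta * x * (1.0 - x)

def system_noise_sequence(x0, theta, n, sigma, rng):
    """System noise: each perturbation is fed back into the map."""
    xs = [x0]
    for _ in range(n):
        xs.append(logistic(xs[-1], theta) + rng.gauss(0.0, sigma))
    return xs

def additive_noise_sequence(x0, theta, n, sigma, rng):
    """Additive noise: the clean orbit is iterated, noise is added afterward."""
    clean = [x0]
    for _ in range(n):
        clean.append(logistic(clean[-1], theta))
    noisy = [x + rng.gauss(0.0, sigma) for x in clean]
    return clean, noisy

rng = random.Random(0)
clean, noisy = additive_noise_sequence(0.3, 3.7, 50, 1e-3, rng)
# In the additive case the underlying orbit still obeys the map exactly.
max_residual = max(abs(clean[i + 1] - logistic(clean[i], 3.7)) for i in range(50))
```

In the system-noise sequence no such clean orbit exists, because every
perturbation changes all later states.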

Some  maps  are  chaotic;  that  is,  they  display  a  behavior  that  is  complex  and 
aperiodic.  For  concreteness,  the  development  in  this  report  concentrates  on 
a  classic  example  of  a  chaotic  map,  the  logistic  map.  The  logistic  map  [3],  a 
functional  transformation  of  the  unit  interval  on  the  real  axis  into  itself,  is 
described  by  the  equation 


    x_{i+1} = \theta x_i (1 - x_i) .  (4)

For some values of the parameter θ, the sequence of points generated by the
map is chaotic for almost all initial points. Chaotic sequences have the
following properties: sequences generated starting with two nearby points
diverge from each other exponentially (called sensitivity to initial
conditions), the sequences are aperiodic, and the points are distributed on
the real line with a natural probability density p(x). Figure 1 shows the
graph of the logistic map for θ = 3.7, along with the tenth iterate of the
map. The extreme sensitivity to initial conditions is apparent in the graph
of the tenth iterate: changing x_0 slightly has a large effect on the
tenth-iterated value. The global chaos is not dependent on initial
conditions; that is, the conditional probability density p(x|x_0) = p(x) for
almost all initial points x_0.
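
Sensitivity to initial conditions is easy to observe numerically. The
following minimal sketch iterates the logistic map from two starting points
that differ by 1e-9; the starting value 0.3, the offset, and the number of
iterations are arbitrary choices made for illustration.

```python
def logistic_orbit(x0, theta, n):
    """Return [x_0, x_1, ..., x_n] for the logistic map x -> theta*x*(1-x)."""
    xs = [x0]
    for _ in range(n):
        xs.append(theta * xs[-1] * (1.0 - xs[-1]))
    return xs

theta = 3.7
a = logistic_orbit(0.3, theta, 60)
b = logistic_orbit(0.3 + 1e-9, theta, 60)
separation = [abs(u - v) for u, v in zip(a, b)]
# The separation starts near 1e-9 and grows by many orders of magnitude.
print(separation[0], max(separation))
```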

Figure 1. Logistic map for θ = 3.7: first and tenth iterate.

Figure 2. Bifurcation diagram (density versus control parameter) for logistic
map.

The logistic map was investigated in depth by Feigenbaum [4] and shown to be
related in many important ways to almost all single-hump maps of an interval
of the real axis into itself. In this report, the detailed behavior of the
map is not considered. One important aspect of the behavior of the logistic
map that is relevant to the parameter estimation problem, however, is the
qualitatively different behavior that occurs for different values of the
parameter. As the parameter θ is continuously varied from 0 to 4 through the
positive real numbers, the nature of the sequence of iterates changes
dramatically. Figure 2 is a plot of the points that are generated by the map
as a function of the control parameter θ for 250 iterates. This figure can be
viewed as showing the density of the map-generated points on the unit
interval versus the control parameter. Darker bands thus correspond to higher
density, and lighter regions to lower density. (The map was iterated before
plotting to allow the decay of transients. This type of plot is generally
called a bifurcation diagram.)
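
The data behind such a diagram can be generated with a few lines of code. The
sketch below is an assumption-laden illustration, not the procedure used for
figure 2: the transient length, the rounding used to merge numerically equal
points, and the function name are all hypothetical choices.

```python
def attractor_points(theta, x0=0.5, transient=2000, keep=250, digits=6):
    """Iterate the logistic map, discard transients, return distinct points."""
    x = x0
    for _ in range(transient):
        x = theta * x * (1.0 - x)
    points = set()
    for _ in range(keep):
        x = theta * x * (1.0 - x)
        points.add(round(x, digits))   # merge points equal to 6 decimals
    return sorted(points)

# One column of the diagram per parameter value.
for theta in (2.8, 3.2, 3.5, 3.7):
    print(theta, len(attractor_points(theta)))
```

Periodic parameter values give a handful of distinct points, while chaotic
values such as θ = 3.7 fill out bands on the unit interval.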

The features of the plot that are important to the estimation problem are as
follows: First, as the control parameter is varied from 0 to a critical value
of about 3.57, the generated sequence is periodic, with the period length
increasing in abrupt jumps as the control parameter is increased. This
behavior is called period doubling or bifurcation. Past this critical value,
the sequence is aperiodic but still distributed over bands. As θ is
increased, the bands move together until the density becomes connected over a
single interval. The natural probability density for θ = 3.7 is shown in
figure 3. At this value of the control parameter, the density is connected
over one interval, but there are singularities everywhere. Also important for
the parameter estimation problem is the existence of periodic windows in the
chaotic regions. The precise distribution of chaotic and periodic parameter
values is very complex (the set of chaotic parameter values is a fractal),
but the important aspect of this behavior is that the estimation technique
that is developed performs best when there is chaos. Because the sensitivity
caused by the presence of chaos makes the sequence of iterates generated by
the map extremely sensitive to changes in the value of the parameter θ, the
parameter can be estimated with great accuracy. If the map is not in the
chaotic region, one cannot expect the estimator to perform as well.

3.  The  Parameter  Estimation  Problem  for  Maps 

Consider a discrete-time chaotic system described by equation (1), where the
map F is an implicit function of one or more parameters defined by the
parameter vector θ. The problem is to estimate the parameter vector from a
finite-time observation of the sequence generated by the system. A parameter
estimator for a general map is a function that maps a state vector sequence
of a given length into an estimate of the parameters of the map:

    \hat{\theta} = \Theta[X^{n}] .  (5)

The parameter estimator is a function of the data vector and is therefore
itself a random variable. The observations of the sequence can be imprecise
because of additive noise or quantization. Also, there may be system noise
present in which each successive state of the system is perturbed by a random
variable. The case of system noise is not considered in this development.

Certain attributes of a good parameter estimator are desired. First, the
estimator should be nearly unbiased; that is, the expectation of the estimate
should equal the correct parameter. Second, the estimator should yield
estimates that are consistently close to the correct parameter. Although many
possible criteria exist for evaluating the performance of an estimator, the
usual measure of performance is the variance of the estimated values. In this
report, the usual minimum variance criterion is used, but an important point
should be mentioned. Because the treatment given here is for small noise
amplitudes and uses local linear approximations to the maps, the impression
might be that there exists no efficient estimation procedure for higher noise
amplitudes. This is not necessarily so; in fact, it appears likely that the
parameter of the map can be extracted efficiently with higher noise levels,
but the estimator would have to be more complex. Furthermore, the existence
of a threshold of noise amplitude beyond which the optimal estimator
considered here deviates from the theoretical performance does not preclude
its use in these higher noise levels. The estimator described in this report
is therefore optimal only for sufficiently small noise amplitudes, but for
uncorrelated noise it will be shown to be the best estimator that can be
found in the strict sense that it minimizes the variance of the estimate over
the set of all possible estimation functions defined on a given data
sequence.

4.  Some  Parameter  Estimators  for  the  Logistic  Map 

A straightforward way to estimate the parameter of the logistic map is simply
to take the ith estimate of the parameter as

    \hat{\theta}_i = \frac{x_i}{x_{i-1}(1 - x_{i-1})} .  (6)

This is equivalent to choosing the estimate such that f(x_{i-1}) = x_i. The
properties of this estimator are now briefly considered. First, the bias in
the expectation of the i = 1 estimate of θ is given by

    E[\hat{\theta}_1 - \theta] = E\left[ \frac{\tilde{x}_1 + \xi_1}{(\tilde{x}_0 + \xi_0)(1 - \tilde{x}_0 - \xi_0)} \right] - \theta ,  (7)

where the tildes over the variables indicate that these are the underlying
points generated by the map unperturbed by the noise, and ξ_i is the noise
perturbation. (Computing the bias for i = 1 is really general for all i and
is done only to simplify the notation.) This expectation can be evaluated for
small noise amplitudes by using

    [(\tilde{x}_0 + \xi_0)(1 - \tilde{x}_0 - \xi_0)]^{-1} \approx \frac{1}{\tilde{x}_0(1 - \tilde{x}_0)} - \frac{(1 - 2\tilde{x}_0)\xi_0 - \xi_0^2}{[\tilde{x}_0(1 - \tilde{x}_0)]^2} .  (8)

The bias in the estimate is then given by

    E[\hat{\theta}_1 - \theta] = \frac{\theta\sigma^2}{\tilde{x}_0(1 - \tilde{x}_0)} ,  (9)

where σ² is the noise variance. The estimator is therefore unbiased to first
order. (For the estimator to be biased in the first-order analysis, the bias
would have to be directly proportional to the first or lower power of σ.)

These estimates for the parameter are generally combined in some manner to
converge to the correct value as the sample size is increased; some
algorithms just add them, but it is obvious that the variance of each
individual estimate is dependent upon the state of the map. A more accurate
estimate is therefore obtained by using

    \hat{\theta} = \sum_i w_i \hat{\theta}_i ,  (10)

where the w_i form a weight vector chosen to minimize the variance of the
estimate. The coefficients are then to be determined. A first-order approach
to choosing the w_i approximates the error in the ith estimate of θ as

    \delta\hat{\theta}_i \propto \frac{1}{x_{i-1}(1 - x_{i-1})} ,  (11)

which is reasonable considering the functional form of the map. The
minimization of the variance of this estimator using Lagrange multipliers
based on the above intuitive expression for the error then yields

    w_k = \frac{v_k^2}{\sum_i v_i^2} , \quad \text{with} \quad v_k = x_{k-1}(1 - x_{k-1}) .  (12)

(The technique for applying Lagrange multipliers to this type of problem is
described later in the development of the optimal estimator.) This estimation
technique is in fact equivalent to least-squares fitting a second-order
polynomial to an embedding of pairs of points from the data sequence. It is
also the same as retaining only the diagonal terms in the covariance matrix
for the optimal estimator to be developed here, and using a crude estimate
for the variance.
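
The pairwise estimates and the crude weights can be written down directly.
The sketch below is a hedged illustration: it reads the pairwise estimate as
x_i / [x_{i-1}(1 - x_{i-1})] and the weights as proportional to v_k squared,
and the function names are hypothetical. It also shows why the error of an
individual estimate depends on the state: perturbing one observation by eps
shifts the corresponding estimate by eps / v.

```python
def pairwise_estimates(xs):
    """theta_i = x_i / (x_{i-1} * (1 - x_{i-1})) for each neighboring pair."""
    return [xs[i] / (xs[i - 1] * (1.0 - xs[i - 1])) for i in range(1, len(xs))]

def crude_weights(xs):
    """Weights proportional to v_k**2, with v_k = x_{k-1} * (1 - x_{k-1})."""
    v2 = [(x * (1.0 - x)) ** 2 for x in xs[:-1]]
    total = sum(v2)
    return [v / total for v in v2]

theta = 3.7
xs = [0.3]
for _ in range(20):
    xs.append(theta * xs[-1] * (1.0 - xs[-1]))

estimates = pairwise_estimates(xs)      # noiseless data: every entry is theta
weights = crude_weights(xs)
combined = sum(w * t for w, t in zip(weights, estimates))

# Perturb the single observation x_5: the estimate built on (x_4, x_5)
# moves by eps / v_5, so small v means a large error.
eps = 1e-4
perturbed = list(xs)
perturbed[5] += eps
shift = pairwise_estimates(perturbed)[4] - estimates[4]
v5 = xs[4] * (1.0 - xs[4])
```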


Figure 4. Embedding reconstruction of logistic map for θ = 3.7.


The method by which both map estimators and differential equation estimators
are usually implemented is equivalent to the crude estimator described above.
This equivalence is important because it shows that the common procedure for
estimating equations does not nearly achieve the performance that is
possible. An implementation of a singular value decomposition (SVD) algorithm
is usually used in the following manner to do least-squares fitting, and is
sometimes described as minimizing the forward prediction error for maps (as
well as differential equations). The standard implementation of the SVD
algorithm is capable of least-squares fitting an nth-order polynomial to an
embedded data set in function space. In the standard implementation,
neighboring pairs of data points are embedded in a Cartesian plane so that

    (x, y) = (x_{i-1}, x_i) .  (13)

This does not reconstruct the attractor of the system as with time delay
embedding techniques; it reconstructs the map in function space or, in this
case, in the map plane. This reconstruction is illustrated in figure 4, which
shows 5000 pairwise embedded points in the map plane for the logistic map
with θ = 3.7. Independent Gaussian noise perturbations with a standard
deviation of σ = 0.01 were added to each point in the data sequence produced
by the map before the points were paired and embedded. The graph of the
unperturbed map function is also shown as a solid curve in the figure. The
effects of the noise are evident in the fuzzy picture of the logistic map (an
inverted parabola) that is produced. The effects of the attractor and of the
singularity distribution of the density are also apparent: the reconstructed
function extends only over a portion of the unit interval (only over the
attractor), and there are regions of visibly greater point density near the
most prominent singularities. The SVD algorithm is then used to find the
polynomial of a given order that fits the embedded points with the lowest
mean-square error. For the logistic equation with the SVD algorithm
constrained to a second-order polynomial, this is the same as minimizing the
sum of the squares of the vertical distances to the estimated logistic map
function.

This curve fitting approach will now be shown to be equivalent to the
estimator using the weighting of pairwise estimates developed previously. The
χ-squared error over all embedded points to be minimized is

    \chi^2 = \sum_i \left[ x_i - \hat{f}(x_{i-1}) \right]^2 ,  (14)

where f̂ is the estimated logistic map functional. The condition for
minimizing the χ-squared error is thus given by

    \sum_i \left[ x_i - \hat{f}(x_{i-1}) \right] \frac{\partial \hat{f}(x_{i-1})}{\partial \hat{\theta}} = 0 .  (15)

With the use of the expression for the estimated logistic map,
f̂(x_{i-1}) = θ̂ x_{i-1}(1 - x_{i-1}), and the definition
v_i = x_{i-1}(1 - x_{i-1}), this condition yields

    \hat{\theta} = \frac{\sum_i x_i v_i}{\sum_i v_i^2}  (16)

for the estimated value of the parameter. With θ̂_i = x_i/v_i, equation (16)
becomes

    \hat{\theta} = \frac{\sum_i v_i^2 \hat{\theta}_i}{\sum_i v_i^2} .  (17)

With the value of the weight given in equation (12), equations (17) and (10)
are identical. The least-squares fit to the functional form of the map is
therefore equivalent to the crude estimator considered previously.
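
The equivalence can be checked numerically. The sketch below assumes the
closed-form least-squares solution sum(x_i v_i) / sum(v_i^2) and weights
v_i^2 / sum(v^2) for the pairwise estimates x_i / v_i; the noise level, seed,
sequence length, and function names are illustrative choices only.

```python
import random

def least_squares_theta(xs):
    """Least-squares fit of x_i = theta * v_i, v_i = x_{i-1} * (1 - x_{i-1})."""
    v = [x * (1.0 - x) for x in xs[:-1]]
    num = sum(xi * vi for xi, vi in zip(xs[1:], v))
    den = sum(vi * vi for vi in v)
    return num / den

def weighted_pairwise_theta(xs):
    """Weighted sum of pairwise estimates x_i / v_i, weights v_i**2 / sum(v**2)."""
    v = [x * (1.0 - x) for x in xs[:-1]]
    den = sum(vi * vi for vi in v)
    return sum((vi * vi / den) * (xi / vi) for xi, vi in zip(xs[1:], v))

rng = random.Random(1)
theta, sigma = 3.7, 0.01
clean = [0.3]
for _ in range(500):
    clean.append(theta * clean[-1] * (1.0 - clean[-1]))
noisy = [x + rng.gauss(0.0, sigma) for x in clean]

ls = least_squares_theta(noisy)
wp = weighted_pairwise_theta(noisy)
# The two numbers agree to rounding error, whatever the noise realization.
print(ls, wp)
```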


5.  The  Cramer-Rao  Bound  for  Chaotic  Maps 

The Cramer-Rao bound [5] for the lowest attainable mean-squared error is now
calculated for the logistic map. The calculation shows that the error bound
decreases exponentially with the length of the data vector. This is because
of the sensitivity of the map to small changes in the parameter value. If a
small change in the parameter has a large effect on the data vector, one
would expect to be able to estimate the parameter very accurately. The
Cramer-Rao bound for the variance of a single parameter θ is given by [5]

    \sigma_{\hat{\theta}}^2 \ge \frac{1}{E\left\{ \left[ \frac{\partial}{\partial\theta} \log q(X|\theta) \right]^2 \right\}} ,  (18)

where the expectation is taken over all possible data vectors X, and q(X|θ)
is the conditional density of the data vector at the parameter value θ. The
conditional density for normally distributed noise is dependent on θ only
through the mean and, with f^{(i)}(x_0) denoting the ith iterate of the map
applied to x_0, is given by

    q(X|\theta) = \prod_i \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{[x_i - f^{(i)}(x_0)]^2}{2\sigma^2} \right\} .  (19)


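Taking the derivative in equation (18) yields

    \sigma_{\hat{\theta}}^2 \ge \frac{\sigma^4}{E\left\{ \left[ \sum_i \left( x_i - f^{(i)} \right) f^{(i)}_\theta \right]^2 \right\}} ,  (20)

where f^{(i)} is the ith iterate of the map applied to x_0 and the subscript
θ denotes differentiation with respect to θ. If we assume that the noise
vector has uncorrelated components, the expression becomes

    \sigma_{\hat{\theta}}^2 \ge \frac{\sigma^2}{\sum_i \left[ f^{(i)}_\theta \right]^2} ,  (21)

which is the desired Cramer-Rao bound on the variance of the parameter
estimate. The derivative can be evaluated by using the expression for the
logistic map, and differentiating to obtain a recursion for the derivative,

    f^{(i)}_\theta = \theta \frac{\partial}{\partial\theta} \left[ f^{(i-1)} \left( 1 - f^{(i-1)} \right) \right] + f^{(i-1)} \left( 1 - f^{(i-1)} \right) .  (22)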

Figure 5. Magnitude of θ derivative of tenth logistic map iterate.

Expression (21) for the Cramer-Rao bound therefore roughly means that an
efficient estimator (for which eq (18) is an equality) has an error that is
inversely proportional to the sensitivity of the mean values of the data
points to changes in θ. A plot of the magnitude of the θ derivative of the
tenth iterate of the logistic map at θ = 3.7 versus initial conditions is
shown in figure 5. For the points where the magnitude of the derivative is
large, the Cramer-Rao bound is small. Note that the derivative is also
sensitive to the initial condition x_0. This means that for some values of
x_0, small changes in θ will not greatly affect the data sequence. At these
values it is impossible to obtain a good estimate of the parameter value.
Because the bound decreases exponentially with n for a typical trajectory,
equation (21) indicates that an efficient estimator would be far superior to
those considered in the previous section. The existence of an efficient
estimator is now demonstrated, and practical methods for its implementation
are given. It is also shown that although uncertainty in the exact values of
x limits the accuracy of the estimator in the presence of noise, it is still
superior to the ones that have been used before.
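
The θ derivative of the iterates, and a bound built from it, can be computed
with a simple recursion. The sketch below makes two assumptions that should
be kept in mind: the recursion is written in its expanded form
d_i = f(1 - f) + θ(1 - 2f) d_{i-1}, with f the previous iterate, and the
bound for uncorrelated Gaussian noise is taken to be σ² divided by the sum of
squared derivatives. The function names are hypothetical.

```python
def theta_derivatives(x0, theta, n):
    """Return [d f^(i)(x0) / d theta for i = 1..n] using the recursion
    d_i = f^(i-1) * (1 - f^(i-1)) + theta * (1 - 2 f^(i-1)) * d_{i-1}."""
    x, d = x0, 0.0            # f^(0) = x0 does not depend on theta
    out = []
    for _ in range(n):
        d = x * (1.0 - x) + theta * (1.0 - 2.0 * x) * d
        x = theta * x * (1.0 - x)
        out.append(d)
    return out

def cramer_rao_bound(x0, theta, n, sigma):
    """sigma**2 / sum_i (d f^(i) / d theta)**2 for independent Gaussian noise."""
    return sigma ** 2 / sum(d * d for d in theta_derivatives(x0, theta, n))

def iterate(x0, theta, n):
    """n-th iterate of the logistic map applied to x0."""
    x = x0
    for _ in range(n):
        x = theta * x * (1.0 - x)
    return x

# Finite-difference check of the recursion for the tenth iterate.
x0, theta, h = 0.3, 3.7, 1e-7
numeric = (iterate(x0, theta + h, 10) - iterate(x0, theta - h, 10)) / (2 * h)
analytic = theta_derivatives(x0, theta, 10)[-1]
bounds = [cramer_rao_bound(x0, theta, n, 1e-3) for n in (2, 5, 10, 20)]
```

Because each added data point contributes a non-negative term to the sum, the
bound can only shrink as the sequence grows.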


6.  Optimal  Parameter  Estimator  for  One-Dimensional 
Maps 

The Cramer-Rao bound suggests that an estimator for the logistic map may
exist such that the error in the estimated parameter decreases exponentially
as a function of the number of samples. It is now shown that such an
efficient estimator does exist and that it achieves the Cramer-Rao limit, but
only for certain chaotic trajectories. Practically, the estimator is better
than the usual methods for all data blocks, but the introduction of
uncertainties in the mean value of the data vector by additive noise causes
the performance to fall below the Cramer-Rao limit for some data vectors.

As a first step in finding the most efficient estimator, assume that the
noise variance is small enough so that the map can be linearized in the
region of interest. First, for sufficiently small noise amplitudes, consider
the estimator formed by a linear combination of estimator functions operating
only on pairs of data points at a time:

    \hat{\theta} = \sum_{i,j} w_{ij}\, G_{ij}(x_i, x_j) .  (23)

Because any estimator can be expanded in a Taylor series for small noise
amplitude as a linear combination of functions of each sample, it can be
expanded as functions of data pairs. This expression is therefore general
enough to represent any small-noise estimator for the data block. First,
consider an estimate of the parameter based on the (x_i, x_j) data pair for a
given data sequence. (From this point on consider a hypothetical fixed data
sequence and drop the functional notation used in eq (23).) The only
estimator that is unbiased is the one that chooses the parameter to be such
that f_ij = x_j passes through the embedded (x_i, x_j) pair. (This notation
means that the (j - i)th iterate of the map operates on the data point x_i to
produce the value x_j.) It is true, however, that there may be multiple
values of θ that cause coincidence of this iterate of the map with the
embedded data, but it is assumed that the estimate of θ is resolved of
ambiguities by successive applications of the technique with lower-order
iterates of the map.

If the map is linearized in the region of the embedded data pair, the
estimate of θ can be written in terms of derivatives of the map iterates with
respect to both x and θ. The estimate is given by

    \hat{\theta}_{ij} = \theta + \frac{\xi_j - f_{ij,i}\,\xi_i}{f_{ij,\theta}} ,  (24)

where f_{ij,i} is the derivative with respect to x_i of the (j - i)th iterate
of the map operating on point x_i, and f_{ij,θ} is the θ derivative. The
geometrical meaning of equation (24), as well as some intermediate steps in
the derivation, is shown in figure 6. The embedded noiseless pair (x̃_i, x̃_j)
is perturbed to the observed pair location (x_i, x_j) = (x̃_i + ξ_i, x̃_j +
ξ_j) by the noise vector (ξ_i, ξ_j) in the function space (function plane) of
the (j - i)th iterate of the map. (The iterated map function f_ij passes
through the noiseless pair.) Now δθ · f_{ij,θ} is the vertical displacement
of the function f_ij required for a given small change in θ (given by δθ) to
make the iterated map pass through the noise-perturbed pair. This
displacement is geometrically given by the vertical noise perturbation ξ_j
minus the local linear approximation to the vertical displacement caused by
the horizontal noise perturbation, which is given by f_{ij,i} ξ_i. The error
in θ caused by the noise is δθ; so θ̂_ij = θ + δθ is the estimate of θ. Using
equation (24), the covariance of the pairwise estimates is

    \mathrm{Cov}[\hat{\theta}_{ij}, \hat{\theta}_{kl}] = \sigma^2\, \frac{\delta_{jl} - f_{kl,k}\,\delta_{jk} - f_{ij,i}\,\delta_{il} + f_{ij,i} f_{kl,k}\,\delta_{ik}}{f_{ij,\theta}\, f_{kl,\theta}} .  (25)
Figure 6. Graphical derivation of pairwise estimator.

Figure 7. Graphs depicting (a) linked triplets of pairwise estimators, and (b
and c) two implementations of the optimal estimator. (Panel a: linked
triplets; panel c: single step pairwise; data points x_0 through x_8.)

To expedite the derivation, it is now shown that because the map is iterated
sequentially to produce the data sequence, any linear combination of three
pairwise estimators that are fully linked is degenerate. Fully linked
pairwise estimators are defined by their connectivity graph, an example of
which is shown in figure 7(a). (Fig. 7(b) and (c) are for later reference.)
In this graph, an arrow connecting two data points indicates a pairwise
estimator operating on the data points. A linear combination of three
pairwise estimators such that a complete circuit of all three points can be
made is said to be fully linked, or to form a linked triplet. The degeneracy
of a linked triplet is readily shown by computing the determinant of the
matrix of coefficients of each independent Gaussian random variable in
equation (24) for the [(ij), (jk), (ik)] triplet of pairwise estimates:

    \det[L] = \det \begin{bmatrix} -f_{ij,i}/f_{ij,\theta} & 1/f_{ij,\theta} & 0 \\ 0 & -f_{jk,j}/f_{jk,\theta} & 1/f_{jk,\theta} \\ -f_{ik,i}/f_{ik,\theta} & 0 & 1/f_{ik,\theta} \end{bmatrix} .  (26)

Expanding the determinant gives

    \det[L] = \frac{f_{ij,i}\, f_{jk,j}}{f_{ij,\theta}\, f_{jk,\theta}\, f_{ik,\theta}} - \frac{f_{ik,i}}{f_{ij,\theta}\, f_{jk,\theta}\, f_{ik,\theta}} ,  (27)

and using the identity

    f_{ij,i}\, f_{jk,j} = f_{ik,i}  (28)

and equation (27) yields det[L] = 0. The linked triplet is therefore
degenerate and can be reduced by deletion of any one pairwise estimator
without affecting the quality of the estimate. Continuing in this manner, a
reduced set of estimators can be found for any linked set of three or more
estimators. This reduced set can be any set of estimators on pairs of data
points such that there are no linked triplets, but the set must contain all
the sequence points. The estimate for a given data vector can be written as a
linear combination over the reduced set (with each index labeling one
pairwise estimator of the reduced set) as

    \hat{\theta} = \sum_i w_i \hat{\theta}_i ,  (29)

and the variance is minimized by the Lagrange multiplier equation

    \frac{\partial}{\partial w_k} \left[ \sum_{i,j} C_{ij} w_i w_j - \lambda \sum_i w_i \right] = 0 ,  (30)

with the constraint equation ensuring an unbiased estimate given by

    \sum_i w_i = 1 .  (31)

Here C = [C_ij] is the covariance matrix of the pairwise estimators composing
the estimator on the full data vector. Equation (30) with constraint equation
(31) has the solution

    w_i = \frac{\sum_j C^{-1}_{ij}}{\sum_{k,l} C^{-1}_{kl}} ,  (32)

and with the use of the constraint equation, the value of the minimal
variance is

    \mathrm{Var}[\hat{\theta}] = \frac{1}{\sum_{k,l} C^{-1}_{kl}} .  (33)

This equation is true only for the true covariance matrix derived above. For
other weight coefficients, the variance must be calculated using

    \mathrm{Var}[\hat{\theta}] = \sum_{i,j} C_{ij} w_i w_j ,  (34)

where C is the true covariance matrix.
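
The constrained minimization has a compact numerical form: the weights are
the normalized row sums of the inverse covariance matrix. The sketch below
illustrates this under stated assumptions; the 3x3 covariance matrix is a
made-up example, not one derived from the map, and the helper names are
hypothetical.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def min_variance_weights(cov):
    """Weights proportional to the row sums of cov^-1, normalized to sum to 1.
    Also returns the minimal variance 1 / sum(cov^-1)."""
    y = solve(cov, [1.0] * len(cov))     # y = cov^-1 @ ones
    total = sum(y)
    return [v / total for v in y], 1.0 / total

def variance(cov, w):
    """Variance of the weighted sum for arbitrary weights: sum_ij C_ij w_i w_j."""
    n = len(w)
    return sum(cov[i][j] * w[i] * w[j] for i in range(n) for j in range(n))

# Hypothetical covariance matrix, chosen only for illustration.
cov = [[2.0, -0.5, 0.0],
       [-0.5, 1.0, -0.3],
       [0.0, -0.3, 4.0]]
w_opt, var_opt = min_variance_weights(cov)
var_equal = variance(cov, [1.0 / 3.0] * 3)   # equal weights, for comparison
```

For any other weights that sum to one, the quadratic form gives a variance at
least as large as the minimal value.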

The theoretical error variance for three different estimators is shown in
figure 8. The worst performance (upper curve) is given by the SVD
implementation of the least-squares technique, equivalent to the crude
estimator considered previously. The middle curve is for an estimator that
uses the correct values for the variance of the estimates, but only uses the
diagonal form of the covariance matrix in minimizing the variance of the
overall estimate. This estimator might be more practical to use in situations
prohibiting the inversion of a large and possibly ill-conditioned covariance
matrix. The best performance curve (lower curve) is for the optimal technique
developed here. The optimal estimator is, however, degraded relative to the
Cramer-Rao bound for some trajectories because of the effects of the x
derivative term in equation (24). This x derivative term accounts for the
uncertainty in the mean data vector, which leads to an uncertainty in the
covariance matrix, and thus makes it impossible to attain the Cramer-Rao
bound for all data vectors.

7.  Implementing  the  Optimal  Estimation  Algorithm 

The estimator described above attains the Cramer-Rao LMSE bound only for some
sequences. Also, if the estimator is used on long sequences, the theoretical
performance curves are no longer valid because the linear approximations
become inadequate. It is apparent, however, by inspection of the performance
curves, that the optimal estimator has several minima, and this property can
be used in a more complex estimator that uses the optimal estimator as a
block estimator and then combines the blocks to reduce the error. There is
also a high probability for a long data sequence that x will eventually fall
within the central minimum of the performance curve for a higher order. The
parameter can then be detected rapidly with exponentially decreasing error
until the width of the minimum becomes too small.

Figure 8. Theoretical

Two  possible  implementations  of  the  optimal  estimator  are  now  outlined: 

(1) In the predict-and-correct implementation, the successive iterates of the
map are used to make finer estimates of the parameter. Because the
higher-order iterates of the map have ambiguous values of θ, the earlier
values are used to both reduce the variance in the weighted sum and resolve
the ambiguities in θ. Figure 9 illustrates the ambiguity problem. This figure
shows that for a given value of the tenth iterate of the point x_0 = 1/2, it
is in general impossible to determine unambiguously the value of the
parameter θ, because many values of θ will produce the same value of the
tenth iterate. The predict-and-correct implementation has the advantages of
functioning in an intuitively obvious manner and being easy to implement. The
higher-order iterates are used to zero in on the parameter because they are
more sensitive to changes in the parameter. Unfortunately, since higher-order
iterates do not have explicit solutions for the parameter, this estimator
uses a linear prediction of θ based on the last estimate for higher orders.
This method is illustrated in graph form in figure 7(b). This graph
illustrates that the predict-and-correct estimator is constructed by
combining pairwise estimators with the same x_0, but using progressively
higher iterates of the map.

Figure 9. Tenth iterate of x_0 = 1/2 versus θ, illustrating ambiguous
estimates.

(2) In the single-step pairwise implementation, the degenerate nature of the
estimates is exploited to produce a simple implementation without
ambiguities. The pairwise estimators used are simply the one-step
estimators, and the effect of combining them is to give the linear
combination the required sensitivity to changes in $\theta$. Remarkably, the
sensitivity-to-parameter effect is reflected in this implementation in the
same way as when higher-order iterates are used directly, as in the
predict-and-correct technique. This implementation circumvents the linear
prediction needed for the predict-and-correct technique, because there is an
explicit solution for $\theta$ with no ambiguities. This technique is
illustrated graphically in figure 7(c).
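
The ambiguity that motivates both implementations can be checked numerically. The sketch below assumes the logistic map has the common form x_{n+1} = theta * x_n * (1 - x_n); the map is defined earlier in the report and is not restated in this section, so this form is an assumption. The function names, the theta grid, and the off-grid test value are illustrative choices. It counts how many grid crossings reproduce a given value of the n-th iterate of x_0 = 1/2.

```python
import numpy as np

def logistic_iterate(theta, x0=0.5, n=10):
    """n-th iterate of x -> theta*x*(1-x) from x0 (assumed map form)."""
    theta = np.asarray(theta, dtype=float)
    x = np.full_like(theta, x0)
    for _ in range(n):
        x = theta * x * (1.0 - x)
    return x

def count_matching_thetas(true_theta, thetas, n):
    """Count sign changes of f^(n)(1/2; theta) - f^(n)(1/2; true_theta)
    along the grid; each sign change brackets one candidate theta."""
    target = logistic_iterate(true_theta, n=n)
    diff = logistic_iterate(thetas, n=n) - target
    return int(np.sum(diff[:-1] * diff[1:] < 0))

thetas = np.linspace(3.6, 4.0, 200001)   # grid step 2e-6 (illustrative)
true_theta = 3.700001                    # chosen to fall between grid points
for n in (1, 10):
    print(n, count_matching_thetas(true_theta, thetas, n))
```

The first iterate is theta/4, which is monotonic in theta and so gives a single candidate, while the tenth iterate gives many. This is why the one-step estimators of item (2) avoid the ambiguity that item (1) has to resolve with earlier estimates.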

The single-step pairwise technique has been implemented on the computer
and tested using computer-generated noise. The normalized deviation of the
parameter error variance for the 10-iterate simulated optimal estimator from
the theoretically predicted variance is shown in figure 10. The noise
distribution used was uniform, with a peak-to-peak amplitude of 0.001. (The
noise variance was therefore $(0.001)^2/12.0$.) Since the simulated variance
was averaged over 1000 trials, the error in the simulation results is
negligible. It has been found that for peak-to-peak noise amplitudes of up
to about 0.01, this estimator performs with less than twice the predicted
error. This deviation is probably entirely due to the local linear
approximations to the map equations.
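
A simulation of this kind can be sketched as follows. The sketch assumes the logistic map form x_{n+1} = theta * x_n * (1 - x_n) and additive uniform measurement noise, and it combines the one-step estimates with equal weights. The equal weighting is only a stand-in: the optimal weights are derived earlier in the report and are not reproduced here, so the variance obtained this way is not the optimal one. The starting point x0 = 0.3, the seed, and all names are hypothetical.

```python
import numpy as np

def simulate_one_step_estimates(theta=3.7, x0=0.3, n_samples=11,
                                amplitude=0.001, trials=1000, seed=0):
    """Monte Carlo run of equal-weight one-step parameter estimates."""
    rng = np.random.default_rng(seed)
    # Noise-free trajectory of the assumed logistic map.
    x = np.empty(n_samples)
    x[0] = x0
    for k in range(n_samples - 1):
        x[k + 1] = theta * x[k] * (1.0 - x[k])
    # Uniform noise with the given peak-to-peak amplitude.
    noise = rng.uniform(-amplitude / 2, amplitude / 2,
                        size=(trials, n_samples))
    y = x + noise
    # Explicit one-step solution: theta = y[k+1] / (y[k] * (1 - y[k])).
    one_step = y[:, 1:] / (y[:, :-1] * (1.0 - y[:, :-1]))
    theta_hat = one_step.mean(axis=1)   # equal weights (assumption)
    return theta_hat, noise

theta_hat, noise = simulate_one_step_estimates()
print("sample noise variance:", noise.var())
print("amplitude^2 / 12     :", 0.001 ** 2 / 12)
print("mean estimate        :", theta_hat.mean())
print("estimate variance    :", theta_hat.var())
```

The first two printed values compare the sample noise variance with the amplitude-squared-over-twelve value for a uniform distribution; the last two summarize the spread of the estimates over the trials.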

Figure 10. Computer simulation results for optimal estimator.

8. Parameter Detection as Information Acquisition

It is now demonstrated that the optimal estimation technique extracts
information about the parameter value at a finite rate. Inspection of the
form of the Cramer-Rao bound shows that, if the initial point $x_0$ is known
with great accuracy, the estimate of the parameter has an accuracy that
increases exponentially with data sequence length. This behavior also holds
if $x_0$ falls within a minimum in the error curve for the higher iterates
of the map. It is shown in this section that, even with a small but finite
noise amplitude, this exponential convergence of the estimate to the correct
parameter value can be described by a finite parameter information
acquisition rate. For simplicity, assume that the initial point is
$x_0 = 1/2$, and that the noise amplitude is small. Using only the $N$th
iterate of the map, the variance in the estimate of $\theta$ is given by


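$$\mathrm{Var}[\hat{\theta}_{0N}] = \frac{\sigma^2}{[f^{(N)}_\theta(x_0)]^2} , \tag{35}$$

where $\sigma^2$ is the noise variance and $f^{(N)}_\theta$ denotes the
derivative of the $N$th iterate of the map with respect to $\theta$. If the
(arbitrarily small) probability of error in the $k$th digit of $\theta$ is
desired to be $p^*$, then the width of the interval that contains the real
value of $\theta$ with probability $1 - p^*$ is given by

$$\delta\theta|_{p^*} = g(p^*)\sqrt{\mathrm{Var}[\hat{\theta}_{0N}]} = \frac{g(p^*)\,\sigma}{|f^{(N)}_\theta(x_0)|} . \tag{36}$$

Here $g(p^*)$ is the number of standard deviations of the density function
that must be included in the interval of $1 - p^*$ certainty so that the
probability of error is $p^*$. The number of digits $n_d$ (in the fractional
part of $\theta$) detected at the desired accuracy is thus given by

$$10^{-n_d} = \delta\theta|_{p^*} . \tag{37}$$

Solving for the number of digits resolvable yields

$$n_d = \log_{10}|f^{(N)}_\theta(x_0)| - \log_{10}[g(p^*)\,\sigma] . \tag{38}$$
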
The  asymptotic  digit  reception  rate  is  therefore  given  by 


which, for a discrete-time sequence, is the number of digits received as a
function of time. Now, since the map is chaotic, the derivative with respect
to $\theta$ is asymptotically exponential in $N$ and can be written as

$$f^{(N)}_\theta = A\,10^{aN} , \tag{40}$$

where $A$ and $a$ are constants. The limiting value of the digit reception
rate, or information rate in digits per sample, is therefore given by

$$\frac{dn_d}{dN} = a . \tag{41}$$

The value of $a$ can be determined from the numerical iteration of the
$\theta$ derivative map given in equation (22).
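
As an illustration of such a numerical iteration, the sketch below propagates the parameter derivative alongside the state. It assumes the logistic map form x_{n+1} = theta * x_n * (1 - x_n) and obtains the derivative recursion from the chain rule; equation (22) is not reproduced in this section, so the recursion here is an assumed equivalent and may differ from it in notation. The finite-N ratio log10|derivative|/N is used as an estimate of the constant a, which follows from taking the base-10 logarithm of equation (40). Function names are hypothetical.

```python
import math

def theta_derivative_iterates(theta, x0=0.5, n=10):
    """Derivatives of the first n iterates with respect to theta."""
    x, d = x0, 0.0          # d = dx_k/dtheta; x0 does not depend on theta
    derivs = []
    for _ in range(n):
        # Chain rule on theta*x*(1-x); uses x before it is advanced.
        d = x * (1.0 - x) + theta * (1.0 - 2.0 * x) * d
        x = theta * x * (1.0 - x)
        derivs.append(d)
    return derivs

def digit_rate_estimate(theta, x0=0.5, n=10):
    """Finite-N estimate of a: log10|d f^(N)/d theta| divided by N."""
    last = theta_derivative_iterates(theta, x0, n)[-1]
    return math.log10(abs(last)) / n

for n in (10, 20, 40):
    print(n, digit_rate_estimate(3.7, n=n))
```

The estimate fluctuates with N for short sequences, since the prefactor A in equation (40) contributes a term of order 1/N.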

Because  the  optimal  estimation  technique  combines  the  samples  to  reduce 
the  statistical  variance,  the  information  rate  for  all  samples  combined  can 
be  expressed  as 


Of course, for either equation (39) or equation (42) to be strictly true as
$N \to \infty$, the noise variance must approach zero. For finite times,
however, the estimator will converge exponentially and the information
acquisition rate will be finite. Because the digit reception rate is
independent of noise amplitude for small enough noise levels, this digit
reception rate can be interpreted as detecting the parameter information
contained in the data stream through a noisy measurement channel.

The  Kolmogorov  entropy  [6]  for  a  discrete-time  dynamical  system  is  the 
rate  at  which  information  must  be  given  to  specify  successive  states  of  a 
dynamical  system  to  a  given  degree  of  accuracy  given  the  past  states  of  the 
system.  The  Kolmogorov  entropy  for  a  one-dimensional  map  [3]  is  given 
by 


$$K = \lim_{N\to\infty}\frac{1}{N}\log|f^{(N)}_x| . \tag{43}$$

The Kolmogorov entropy therefore depends upon the sensitivity of iterates
to changes in the initial conditions. A similar function in terms of the
parameter $\theta$ can be defined as

$$H_\theta = \lim_{N\to\infty}\frac{1}{N}\log|f^{(N)}_\theta| , \tag{44}$$

and can be interpreted as the additional information about the parameter
that can be extracted from the $N$th iterate of the map given the past. This
expression, written by analogy with the expression for the Kolmogorov
entropy, is equivalent to the previously developed expression for the
information rate. The main difference in interpretation between the $\theta$
entropy and the Kolmogorov entropy is that $H_\theta$ refers to the rate at
which parameter information is detected, whereas the Kolmogorov entropy can
be interpreted as the rate of refinement of knowledge about initial
conditions [7]. Finally, an expression based on the Cramer-Rao bound for the
parameter entropy is given by equation (42):

$$H_\theta = \lim_{N\to\infty}\frac{1}{N}\log\left[\sum_{j=1}^{N}\left(f^{(j)}_\theta\right)^2\right]^{1/2} . \tag{45}$$

It is expected that equations (44) and (45) will converge to the same value
in the limit as $N \to \infty$ when the map is chaotic, but equation (45) is
more meaningful as a finite-time quantity. Finite-time computations of both
entropy functions for the logistic map with $\theta = 3.7$ are shown in
figure 11. These quantities were computed for the initial condition
$x_0 = 1/2$ and for $N = 10$. They can be considered as coarse-grained
versions of the limiting values in equations (44) and (45). In the limit as
$N \to \infty$, the oscillations of these functions with $\theta$ will occur
on an arbitrarily small scale, characterizing the extreme sensitivity of the
global behavior of the map to the value of $\theta$. In the presence of small
amounts of system noise, however, this arbitrarily fine structure will be
limited. Also, for chaotic maps, the entropies will become independent of
the initial point $x_0$ for almost all $x_0$ in the limit as $N \to \infty$.
This entropy formalism underscores that for the values of $\theta$ for which
the map is chaotic, the parameter can be extracted rapidly and accurately,
even in the presence of small but finite noise fluctuations. This happens
because of the chaotic nature of the data sequence: the sensitivity of the
data vector to changes in $\theta$ allows one to detect $\theta$ accurately,
even when the data vector is slightly altered by additive noise. For values
of $\theta$ where the map is not chaotic, however, the data vector is not as
sensitive to changes in $\theta$, and the parameter $\theta$ cannot be
extracted as rapidly and accurately. Thus, the presence of chaos greatly
enhances one's ability to estimate the parameters of iterated maps.

Figure 11. Finite-time
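
A finite-time computation of the two entropy functions can be sketched as below. Two assumptions are made: the logistic map has the form x_{n+1} = theta * x_n * (1 - x_n), and the Cramer-Rao-based entropy is (1/N) times the log of the root-sum-square of the parameter derivatives of the iterates, which is the form the tent-map calculation later in the report uses for equation (45). Natural logarithms are used, and the small theta grid around 3.7 and the function name are illustrative.

```python
import numpy as np

def finite_time_entropies(thetas, x0=0.5, n=10):
    """Finite-N versions of the two parameter entropies over a theta grid.

    Returns (h44, h45): h44 uses only the N-th derivative, h45 the
    root-sum-square of all N derivatives (assumed form of eq. (45)).
    """
    thetas = np.asarray(thetas, dtype=float)
    x = np.full_like(thetas, x0)
    d = np.zeros_like(thetas)        # derivative of the iterate w.r.t. theta
    sum_sq = np.zeros_like(thetas)   # running sum of squared derivatives
    for _ in range(n):
        d = x * (1.0 - x) + thetas * (1.0 - 2.0 * x) * d
        x = thetas * x * (1.0 - x)
        sum_sq = sum_sq + d * d
    h44 = np.log(np.abs(d)) / n
    h45 = 0.5 * np.log(sum_sq) / n
    return h44, h45

thetas = np.linspace(3.69, 3.71, 5)
h44, h45 = finite_time_entropies(thetas)
for t, a, b in zip(thetas, h44, h45):
    print(f"theta={t:.3f}  H(44)={a:.4f}  H(45)={b:.4f}")
```

Because the sum in the second expression includes the N-th term, it can never fall below the first expression at the same N. The first expression dips sharply wherever the N-th derivative passes near zero, while the second stays bounded below by the earlier terms, which is one way to read the remark that equation (45) is more meaningful at finite times.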

9.  Comments  and  Conclusion 

The exponential convergence of the optimal parameter estimator considered
here occurs only for certain trajectories of the logistic map. For
trajectories that pass close to the point $x_0 = 1/2$ within the length of
the data vector, the $x$ derivative term in equation (24) does not reduce
the effectiveness of the estimator too severely. More precisely, if a data
vector is processed using the optimal estimation technique, the estimator
will converge exponentially to the correct parameter for $m$ time steps,
where $m$ is the largest value of $k$ for which the initial point of the
data vector $x_0$ falls in a minimum of $f^{(k)}(x_0)$. Even here the
definition of a minimum of the iterated map is left deliberately vague,
because the value of $k$ at which exponential convergence ceases is fuzzy.
How close the trajectory must pass to $x_0 = 1/2$ depends upon the length of
the data vector used. For $N = 30$, for example, exponential convergence
occurs when the initial point $x_0$ falls near a prominent minimum of the
optimal estimator performance curve in figure 8. In the limit, as the length
$N$ of the data vector goes to infinity, these minima become smaller until
the exponential convergence occurs only in zero noise. (This is because the
regions over which the high-order iterates of the logistic map have
near-zero slope become very small, and cannot be expected to enhance the
estimate in a practical implementation.) Practically, however, for a finite
data vector length, it is possible to detect the parameter with
exponentially increasing accuracy, even in the presence of finite additive
noise, for these trajectories. It is also apparent from the performance
curve of figure 8 that even when exponential convergence is damped, the
estimator converges more rapidly than the expected $1/\sqrt{N}$ convergence
for an estimator that relies on convergence of a sum of estimates to the
mean.

To illustrate the importance of the values of $x_0$ with zero slope, the
application of the optimal estimation technique is considered for the tent
map. The tent map is a symmetric piecewise-linear triangular map with both
sides having constant slope, and is described by the equation
$x_{n+1} = 1 - 2a|x_n - 1/2|$. There are therefore no values of $x_0$ for
which there is zero slope. For simplicity, consider the variance of an
estimate of the gain parameter $a$, or equivalently of $b = 2a$, based only
on the first and last points in a data vector of length $N$. (It is assumed
that ambiguities have been resolved using lower-order pairwise estimates.)
The value of the $N$th iterate of the tent map can be written as
$f^{(N)}(x_0) = b^N \Delta$, where $\Delta$ is the distance of the initial
point $x_0$ from the nearest point $x$ where the iterated map function
$f^{(N)}(x)$ is zero. (The iterated map function consists of $2^{N-1}$
triangular humps, and thus $2^N$ line segments pieced together, so $\Delta$
is just the distance from the base of the line segment containing the point
$x_0$.) Using equation (24), the variance of an estimate of $b$ can be
written as
$\sigma^2_{0N} = E[(\hat{b}_{0N} - b)^2] = E[(-(b^N \xi_0 - \xi_N)/(x_0 N b^{N-1}))^2]$,
where the initial point has been chosen for convenience to lie on the first
line segment of the iterated map function so that $\Delta = x_0$. (This
restriction will not affect the $N$ dependence of the computed variance
because the magnitude of the slope of the iterated function is constant.)
Letting $c = b^2$ for notational clarity and carrying out the calculation
for the variance yields
$\sigma^2_{0N} = (\sigma^2 c/x_0^2)[(1 + 1/c^N)/N^2]$. It is apparent from
the form of the term in square brackets that, for fixed $x_0$, the error
(standard deviation) decreases only inversely with $N$, and not
exponentially. This slowed convergence at all $x_0$ is caused by the
constant slope of the tent map, and thus the absence of any regions where
the map or its iterates have a near-zero slope. The effect of the $x$
derivative term in equation (24) is therefore severe enough to destroy the
exponential convergence rate of the estimator.
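
The step from the error expression to the closed-form variance can be spelled out as follows. This is a sketch under stated assumptions: the noise terms on the first and last points, written here as $\xi_0$ and $\xi_N$ (the symbols are a notational assumption), are taken to be zero-mean, mutually independent, and each of variance $\sigma^2$, so the cross term vanishes.

```latex
\begin{align*}
\sigma^2_{0N}
  &= \frac{E\!\left[(b^N \xi_0 - \xi_N)^2\right]}{x_0^2 N^2 b^{2N-2}}
   = \frac{\sigma^2\,(b^{2N} + 1)}{x_0^2 N^2 b^{2N-2}} \\
  &= \frac{\sigma^2 b^2}{x_0^2}\,\frac{1 + b^{-2N}}{N^2}
   = \frac{\sigma^2 c}{x_0^2}\left[\frac{1 + 1/c^N}{N^2}\right],
  \qquad c = b^2 .
\end{align*}
```

For $b > 1$ the bracket tends to $1/N^2$, so the standard deviation approaches $\sigma b/(x_0 N)$, which is the inverse-$N$ decay noted in the text.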


The tent map is also useful to illustrate the equivalence of equations (44)
and (45) for the parameter acquisition rate. As previously stated, these two
expressions are expected to converge to the same value for chaotic maps,
and this can be shown analytically for the tent map. As also stated
previously, these expressions are strictly true only if the noise variance
is zero, so for the purpose of this derivation, a zero noise variance is
assumed. (This is usually the case in mathematical derivations of such
quantities. Alternatively, if the initial point $x_0$ is assumed to be known
accurately, the assumption of zero noise variance is rendered unnecessary,
because the estimates will converge exponentially for all chaotic
trajectories.) Assume again for convenience that the initial point $x_0$
lies on the first line segment (with leftmost endpoint at $x = 0$) composing
the $N$th iterate of the tent map, so that equation (44) yields
$H_b = \lim_{N\to\infty}(1/N)\log|N b^{N-1} x_0|$. Taking the limit yields
$H_b = \log b$. If equation (45) is used similarly, then the expression
$H_b = \lim_{N\to\infty}(1/N)\log(\sum_{j=1}^{N}(x_0\, j\, b^{j-1})^2)^{1/2}$
is obtained. The summation in the square root can be expressed as
$x_0^2\sum_{j=1}^{N} j^2 c^{j-1} = x_0^2\,(\partial/\partial c)\,c\,(\partial/\partial c)\sum_{j=0}^{N} c^j$,
so that with the closed-form expression for a finite power series, this
summation becomes

$$x_0^2\,\frac{d}{dc}\,c\,\frac{d}{dc}\left[\frac{c^{N+1}-1}{c-1}\right] = x_0^2\left[\frac{(N+1)^2 c^N}{c-1} - \frac{(2N+3)\,c^{N+1}-1}{(c-1)^2} + \frac{2\,(c^{N+2}-c)}{(c-1)^3}\right] .$$

The first term in square brackets dominates for large $N$, so the expression
for the entropy becomes
$H_b = \lim_{N\to\infty}(1/N)\log[x_0 (N+1)\, b^N/(b^2-1)^{1/2}]$, so that in
the limit $H_b = \log b$. Equations (44) and (45) thus yield identical
expressions for the parameter entropy in the limit of large $N$ for the tent
map. A direct calculation of these two quantities is possible for the tent
map because it is piecewise linear, but the equivalence of the two equations
is caused by the exponential increase in the sensitivity of the iterates to
changes in the parameter, so that equations (44) and (45) should yield
equivalent values for the chaotic trajectories of almost all maps.
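
The closed-form summation and the common limit can be checked numerically. The sketch below takes the tent-map expressions quoted above at face value (the derivative of the j-th iterate with respect to b is j * b^(j-1) * x_0 on the first line segment) and uses illustrative values b = 1.8 and x_0 = 0.3; the function names are hypothetical.

```python
import math

def direct_sum(c, n):
    """Sum over j = 1..n of j^2 * c^(j-1), evaluated term by term."""
    return sum(j * j * c ** (j - 1) for j in range(1, n + 1))

def closed_form(c, n):
    """Closed form from differentiating the finite geometric series."""
    return ((n + 1) ** 2 * c ** n / (c - 1)
            - ((2 * n + 3) * c ** (n + 1) - 1) / (c - 1) ** 2
            + 2 * (c ** (n + 2) - c) / (c - 1) ** 3)

def tent_entropies(b, x0, n):
    """Finite-N values of the two tent-map entropy expressions."""
    h44 = math.log(n * b ** (n - 1) * x0) / n
    h45 = math.log(x0 * math.sqrt(direct_sum(b * b, n))) / n
    return h44, h45

b, x0 = 1.8, 0.3   # illustrative values
for n in (10, 50, 200):
    h44, h45 = tent_entropies(b, x0, n)
    print(n, round(h44, 4), round(h45, 4), round(math.log(b), 4))
```

Both finite-N values approach log b slowly, with a correction of order (log N)/N, which is consistent with the statement that the two expressions agree only in the limit of large N.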
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